Method for generating descriptive trace gaps

ABSTRACT

A method of managing a debug trace data stream by detecting conditions where the trace data generated exceeds the available transmission bandwidth, and inserting a trace data gap into the trace data stream. The gap may contain additional information relating to the amount and type of data that is being lost during the overflow condition. In an alternate embodiment the generated trace may be throttled to ensure the available bandwidth is not exceeded.

TECHNICAL FIELD OF THE INVENTION

The technical field of this invention is debug trace generation.

BACKGROUND OF THE INVENTION

Debugging of embedded solutions has always been a difficult job. As processors become faster and more complex, debugging and development with the current debug technology becomes more difficult. In order to address these complex issues, greater visibility into the program operation is needed. Three areas in which greater visibility is desired are program counter tracing, cycle accurate profiling, and load and store data logging. Access to this data may be available through a dedicated Debug Port. However, each of these problems demands a tremendous amount of information. Simply supplying a large number of high frequency pins to view all of this data is neither practical nor cost effective, and an encoding scheme is needed to further compress all of this data. An encoding has been used that encodes Program Counter (PC) tracing, cycle accurate timing of all instructions, and load and store data logging. All of this data can be transmitted across the same pins on the Debug Port.

The debug port is a tool that provides for the export of software or hardware generated trace information to an external recorder. The trace port utilizes a transmission format that addresses the requirements without noticeably compromising the format efficiency for any given implementation. The format primitives are viewed as a trace export instruction set. All processors use this instruction set to describe the system activity within a device. Each processor can describe the system activity in any manner that uses the instruction set and the rule set governing its use.

It is important to note that the external transmission rates/pins are fixed by the deployed receiver technology. These rates will remain relatively constant over time. This implies that as CPU clock rates increase, there will be increasing pressure to optimize the format to get the most compressed representation of system activity. This will be necessary just to maintain the status quo. Fortunately, the transmission format used provides an efficient means to represent the system activity. However, this efficiency comes at the expense of a larger on-chip hardware expenditure in order to gain the compression efficiency. This gives the processors the capability to improve the efficiency of their export bandwidth as it is stressed by CPU clock rate increases. The steady march to faster CPU clock rates and denser manufacturing processes will necessitate taking advantage of all compression opportunities and the best available physical transmission technology.

The format is designed to provide designers the ability to:

Optimize bandwidth utilization (most real information sent in minimum bits/second)

Choose less efficient but more cost effective representations of system activity

Mix both of the above approaches (i.e. optimize PC trace transmission efficiency while implementing less efficient memory access export)

This gives different processors the ability to represent their system activity in forms most suitable to their architecture.

Tradeoffs have to be made since there are numerous cost/capability/bandwidth configuration requirements. Adjustments can be made to optimize and improve the format over time.

The transmission format remains constant over all processors while the nature of the physical transmission layer can be altered. These alterations can take three forms:

Transmission type (differential serial or conventional single ended I/O)

Number of pins allocated to the transmission

Frequency of the data transmission

This means that the format representing the system activity can be, and is, viewed as data by the actual physical mechanism to be transmitted. The collection and formatting sections of the debug port should be implemented without regard to the physical transmission layer. This allows the physical layer to be optimized to the available pins and transmission bandwidth type without changing the underlying physical implementation. The receiver components are designed to be both physical layer and format independent. This allows the entire transmit portion to evolve over time.

A 10-bit encoding is used to represent the PC trace, data log, and timing information. The trace format width has been decoupled from the number of transmission pins. This format can be used with any number of transmission pins. The PC trace, Memory Reference information, and the timing information are transmitted across the same pins.
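The decoupling of the 10-bit format width from the pin count can be illustrated with a minimal sketch. The function name `pack_packets`, the MSB-first bit order, and the zero padding of the final transfer word are assumptions made for illustration only; the text does not specify how packets are laid out on the pins.

```python
PACKET_BITS = 10  # trace format width stated in the text


def pack_packets(packets, pin_count):
    """Spread 10-bit packets across transfer words of pin_count bits each.

    Assumptions (not from the source): bits are sent MSB first, and the
    final word is padded with zero bits when the stream does not divide
    evenly by the pin count.
    """
    if pin_count <= 0:
        raise ValueError("pin_count must be positive")
    bits = []
    for packet in packets:
        if not 0 <= packet < (1 << PACKET_BITS):
            raise ValueError("packet does not fit in 10 bits")
        # Append the packet's bits, most significant bit first.
        bits.extend((packet >> i) & 1 for i in range(PACKET_BITS - 1, -1, -1))
    # Pad so the last transfer word is complete.
    while len(bits) % pin_count:
        bits.append(0)
    words = []
    for start in range(0, len(bits), pin_count):
        word = 0
        for bit in bits[start:start + pin_count]:
            word = (word << 1) | bit
        words.append(word)
    return words
```

The same packet list can be passed with any `pin_count`; only the number and width of the resulting transfer words change, not the packets themselves.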

Packets can contain opcodes or data, or both. A code packet contains an opcode that indicates the type of information being sent. The opcode can be 2 to 10 bits long. The remainder of the code packet will hold data associated with that opcode.

In many cases, additional data needs to be associated with an opcode. This data is encoded in subsequent packets referred to as data packets. Data packets contain information that should be associated with the previous opcode.

A sequence of packets that begins with a code packet and includes all of the data packets that immediately follow the code packet is referred to as a command. A command can have zero or more parameters. Each parameter is an independent piece of data associated with the opcode in the command. The number of parameters expected depends on the opcode. The first parameter of a command is simply encoded using data packets following a code packet. The first data packet of subsequent parameters is marked with the 10 opcode.

The interpretation of a command is dependent on two factors, the opcode of the command, and the number of parameters included in the command. In other words, a code packet has one meaning if it is immediately followed by another code packet, but the same packet can take on an entirely different meaning if it is succeeded with data packets. Trace opcodes are shown in Table 1.

TABLE 1

Opcode        Meaning
000000 0000   No Information/End of Buffer
000000 0001   Start Repeat Single
000000 0010   PC Trace Gap
000000 0011   Register Repeat
000000 0100   NOP SP loop
000000 0101   SPLOOP marker
000000 0110   Timing Trace Gap
000000 0111   Command Escape
000000 1000   Exception Occurred
000000 1001   Exception Occurred with Repeat Single
000000 1010   Block Repeat 0
000000 1011   Block Repeat 0 with Repeat Single
000000 1100   Block Repeat 1
000000 1101   Block Repeat 1 with Repeat Single
000000 1110   Memory Reference Trace Gap
000000 1111   Periodic Data Sync Point
000001 0xxx   Timing Sync Point
000001 1xxx   Memory Reference Sync Point
000010 xxxx   PC Sync Point/First/Last/
000011 000x   PC Event Collision
000011 001x   Reserved
000011 01xx   Reserved
000011 1xxx   Reserved
00010x xxxx   Extended Timing Data
00011x xxxx   CPU and ASIC Data
0010xx xxxx   Reserved
001100 0000   Memory Reference Trace Gap (legacy)
001100 0001   Periodic Data Sync Point (legacy)
0011xx xxxx   Memory Reference Block
01xxxx xxxx   Relative Branch Command/Register Branch
10xxxx xxxx   Continue
11xxxx xxxx   Timing
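As an illustration of how the variable-length opcodes in Table 1 might be recognized, the following sketch matches a 10-bit packet against the table's bit patterns, where `x` stands for a don't-care bit. The names `OPCODE_TABLE` and `decode_opcode` are hypothetical. Resolving the overlap between the two legacy codes and the Memory Reference Block pattern by taking the first match in table order is an assumption, and the sketch does not attempt to distinguish code packets from data packets.

```python
# Patterns transcribed from Table 1, in table order; 'x' is a don't-care bit.
OPCODE_TABLE = [
    ("0000000000", "No Information/End of Buffer"),
    ("0000000001", "Start Repeat Single"),
    ("0000000010", "PC Trace Gap"),
    ("0000000011", "Register Repeat"),
    ("0000000100", "NOP SP loop"),
    ("0000000101", "SPLOOP marker"),
    ("0000000110", "Timing Trace Gap"),
    ("0000000111", "Command Escape"),
    ("0000001000", "Exception Occurred"),
    ("0000001001", "Exception Occurred with Repeat Single"),
    ("0000001010", "Block Repeat 0"),
    ("0000001011", "Block Repeat 0 with Repeat Single"),
    ("0000001100", "Block Repeat 1"),
    ("0000001101", "Block Repeat 1 with Repeat Single"),
    ("0000001110", "Memory Reference Trace Gap"),
    ("0000001111", "Periodic Data Sync Point"),
    ("0000010xxx", "Timing Sync Point"),
    ("0000011xxx", "Memory Reference Sync Point"),
    ("000010xxxx", "PC Sync Point/First/Last/"),
    ("000011000x", "PC Event Collision"),
    ("000011001x", "Reserved"),
    ("00001101xx", "Reserved"),
    ("0000111xxx", "Reserved"),
    ("00010xxxxx", "Extended Timing Data"),
    ("00011xxxxx", "CPU and ASIC Data"),
    ("0010xxxxxx", "Reserved"),
    ("0011000000", "Memory Reference Trace Gap (legacy)"),
    ("0011000001", "Periodic Data Sync Point (legacy)"),
    ("0011xxxxxx", "Memory Reference Block"),
    ("01xxxxxxxx", "Relative Branch Command/Register Branch"),
    ("10xxxxxxxx", "Continue"),
    ("11xxxxxxxx", "Timing"),
]


def decode_opcode(packet):
    """Return the Table 1 name for a 10-bit packet (first match wins)."""
    if not 0 <= packet < 1024:
        raise ValueError("packet does not fit in 10 bits")
    bits = format(packet, "010b")
    for pattern, name in OPCODE_TABLE:
        if all(p in ("x", b) for p, b in zip(pattern, bits)):
            return name
    raise ValueError("no matching opcode")
```

Because the fully specified codes precede the wildcard patterns in the table, a first-match scan is enough here; a hardware decoder would more likely use a priority encoder on the leading bits.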

The timing trace gap code indicates that some timing trace information is missing at this point. The timing trace remains invalid until the Synchronization code is found in the trace stream. The timing trace gap code can be issued at any point.

It is permissible to have timing syncs included in a gap thus introducing a discontinuity in the timing sync ID sequence.

Issuing of a timing gap command will cause a break in the PC decoding process until the next sync point.

The PC trace gap code indicates that some PC trace information is missing at this point. This could occur for a number of reasons, such as:

The trace queues in the target processor have overflowed before all of the data was transmitted.

A trace sync point was about to get an entire ID value (0-7) behind another sync point.

A trace stream was about to send data commands in an order that violated the predefined rules. This should be prevented by the encoding hardware.

The next PC trace information is a PC Synchronization code and the PC trace remains invalid until the Synchronization code is found in the PC trace stream. The PC Trace Gap code can only be issued at the natural boundary between two packets or packet sequences.

It is permissible to have PC syncs included in a gap thus introducing a discontinuity in the PC sync ID sequence.

SUMMARY OF THE INVENTION

A method is described to inject a ‘data gap’ marker into the trace stream with an accompanying count value that would indicate how much data was lost. In a system that generates multiple trace streams (e.g. timing, PC, data, event), each stream could have a marker with each including information specific to the context of that stream. For instance, a gap on a data trace stream would include information about how many transactions were lost. A gap on a timing stream would include information on how many cycles were lost. A gap on a PC stream would include information on how many discontinuities were lost. A gap on an event trace stream would include information about how many events or event windows were lost.

In addition to including information communicating the amount of data that may be lost, a throttling mechanism may be created to control the amount of trace data. Throttling may be implemented in a number of ways, two of which are shown:

Dead-Window Throttle: A dead window, the duration of which is user programmable, is opened when an internal FIFO reaches a certain threshold. While the window is open, any data transaction that would normally be forwarded to the trace encoding logic is blocked and a data gap is inserted in its place. The dead window expires once the user programmable duration expires.

Real-Time Throttle: In real time throttling, the utilization of the trace bus is monitored constantly. When the utilization exceeds a user-defined threshold, the data trace is either blocked completely (data gap messages would be inserted in their place), or throttled using another technique such as the dead-window throttle. When utilization is less than or equal to the user-defined threshold, data trace operates normally.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other aspects of this invention are illustrated in the drawings, in which:

FIG. 1 is a block diagram showing one embodiment of the invention; and

FIG. 2 shows a second embodiment incorporating the throttling functions.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

A debug trace source has the ability to generate a large amount of data. In some cases, the amount of data that needs to be generated will exceed the bandwidth that is available at the time.

Internal FIFOs can be instantiated to help store transaction information while it is waiting to be encoded and dispatched. Filtering and triggering capabilities can be implemented to allow the user to better refine the rules for which a transfer should be traced. Regardless of either of these, there is a high risk that the trace hardware will be asked to trace something that can't be done due to bandwidth limitations. This will ultimately result in the loss of trace data that the user may be unaware of.

In existing trace solutions implemented by Texas Instruments, encountering a scenario where data trace can't be encoded due to bandwidth restrictions results in a special marker being injected into the trace stream at the next available slot to indicate that a ‘data gap’ has occurred. What's lacking is information related to how big the gap was, or how much data was lost.

In one embodiment of the invention shown in FIG. 1 a ‘data gap’ marker is injected into the trace stream with an accompanying count value that would indicate how much data was lost. In a system that generates multiple trace streams (e.g. timing, PC, load or store addresses, data, event), each stream may have a marker with each including information specific to the context of that stream. For instance, a gap on a data trace stream would include information about how many transactions were lost. A gap on a timing stream would include information on how many cycles were lost. A gap on a PC stream would include information on how many discontinuities were lost. A gap on an event trace stream would include information about how many events or event windows were lost.

As shown in FIG. 1, statistics and various events of interest are input on line 101 to optional FIFO 103, and trace data is input on line 104 to FIFO 102. Both FIFOs 103 and 102 are connected to trace encoder and scheduler 107, with both the incoming data stream 104 and the internal state of FIFO 102 also connected to overflow detect block 105. When an overflow is detected by block 105, the excess data is counted by counter block 106, with both the resulting count and overflow status being communicated to trace encoder and scheduler 107.

Block 107 formats the trace stream, and outputs the results to trace bus 108. In the case of an overflow, as indicated by block 105, a trace gap is generated communicating the amount of missing data.
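The behavior of FIG. 1 can be approximated in software as a minimal sketch. The class name `GapCountingFifo`, the `("GAP", count)` and `("DATA", item)` record shapes, and the choice to attach the lost count to the first item accepted after the overflow are all assumptions made for illustration; the hardware described above is not specified at this level of detail.

```python
from collections import deque


class GapCountingFifo:
    """Software model of FIFO 102 with overflow detect 105 and counter 106."""

    def __init__(self, depth):
        self.depth = depth
        self.fifo = deque()
        self.lost = 0  # items dropped since the last accepted item

    def push(self, item):
        """Offer incoming trace data; returns False if it was dropped."""
        if len(self.fifo) >= self.depth:  # overflow detected
            self.lost += 1                # count the excess data
            return False
        # Remember how many items were lost just before this one.
        self.fifo.append((self.lost, item))
        self.lost = 0
        return True

    def pop(self):
        """Encoder side: return the output records for one buffered item."""
        if not self.fifo:
            return []
        lost_before, item = self.fifo.popleft()
        records = []
        if lost_before:
            # Gap marker carrying the amount of missing data.
            records.append(("GAP", lost_before))
        records.append(("DATA", item))
        return records
```

With a depth of 2, pushing `a`, `b`, `c`, `d` drops `c` and `d`; after one `pop()` makes room and `e` is pushed, the output order is `a`, `b`, a gap record with count 2, then `e`. A fuller model would also flush a trailing gap when no further item arrives after the overflow.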

Tracing the properties of a data transfer (master id, target address, data value) results in a large amount of data that does not compress well being presented at once to the trace encoding hardware. This coupled with the existing filtering and triggering capabilities results in a design that has a high risk of either gapping (dropping trace data because of insufficient bandwidth) or not gapping and consuming excessive amounts of bandwidth on the trace bus.

At the System on Chip (SoC) level, the trace bus is routed through a trace interconnect to one or more endpoints referred to as Trace Sinks. Within the trace interconnect there may be points of constriction resulting from the merging of multiple trace streams or crossing into a clock-domain operating at a lower frequency. Such constriction points result in problem areas for trace sources that require large amounts of bandwidth.

Existing trace sources rely heavily on embedded triggering capabilities that monitor key busses to determine a window or point that needs to be traced, essentially filtering the data as it comes in to limit what is ultimately intended for the trace encode and scheduling logic. In the event this logic can't keep up with the request, gapping messages are generated to indicate that trace information has been lost. At the SoC level there may be a prioritization of trace streams at constriction points in the trace interconnect, or the trace stream may be filtered out altogether on its way to a given trace sink. What is missing is the ability to keep data from being sent to the trace encode and scheduling logic based on temporal knowledge (only allow n transactions over a time span of m clocks, or only allow 1 transaction to be traced every m clocks) and the ability to use real-time throughput statistics to prevent data from being encoded in an effort to reduce the amount of bandwidth consumed by the trace bus.
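The temporal rule mentioned above (only allow n transactions over a time span of m clocks) can be sketched as a sliding-window limiter. The class name `WindowLimiter`, the use of a sliding rather than fixed window, and the cycle-stamped interface are assumptions for illustration and are not taken from the text.

```python
from collections import deque


class WindowLimiter:
    """Allow at most n transactions within any span of m clock cycles."""

    def __init__(self, n, m):
        self.n = n
        self.m = m
        self.accepted = deque()  # cycle stamps of recently allowed transactions

    def allow(self, cycle):
        """Return True if a transaction at this cycle may be traced."""
        # Forget transactions that are m or more cycles old.
        while self.accepted and self.accepted[0] <= cycle - self.m:
            self.accepted.popleft()
        if len(self.accepted) < self.n:
            self.accepted.append(cycle)
            return True
        return False
```

Setting `n` to 1 gives the second form described in the text, where a single transaction is traced every m clocks. Transactions for which `allow()` returns False would be candidates for a gap indication rather than being silently dropped.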

In another embodiment of the invention shown in FIG. 2 a throttling mechanism is created to control the amount of trace data. While other throttling mechanisms are possible, FIG. 2 demonstrates the following two:

Dead-Window Throttle: A dead window, the duration of which is user programmable, is opened when an internal FIFO reaches a certain threshold or when a single transaction occurs. While the window is open, any data transaction that would normally be forwarded to the trace encoding logic is blocked and a data gap is inserted in its place. The dead window expires once the user programmable duration expires.

Real-Time Throttle: In real time throttling, the utilization of the trace bus is monitored constantly. When the utilization exceeds a user-defined threshold, data trace is either blocked completely (data gap messages would be inserted in their place), or throttled using another technique such as the dead-window throttle. When utilization is less than or equal to the user-defined threshold, data trace operates normally.
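The two throttles described above might be modeled as follows. This is a sketch under stated assumptions: the class names, the cycle-based interface, the choice to block the transaction that triggers the dead window, and the measurement of utilization as the fraction of busy cycles over a fixed number of recent samples are all illustrative and not drawn from the text. Only the FIFO-threshold trigger of the dead window is modeled.

```python
from collections import deque


class DeadWindowThrottle:
    """Open a dead window of fixed duration when the FIFO level hits a threshold."""

    def __init__(self, duration, fifo_threshold):
        self.duration = duration            # user programmable, in cycles
        self.fifo_threshold = fifo_threshold
        self.open_until = 0                 # window is open while cycle < open_until

    def blocked(self, cycle, fifo_level):
        """Return True if a transaction at this cycle must be replaced by a gap."""
        if cycle < self.open_until:
            return True
        if fifo_level >= self.fifo_threshold:
            # Assumption: the triggering transaction is itself blocked.
            self.open_until = cycle + self.duration
            return True
        return False


class RealTimeThrottle:
    """Block data trace while measured trace bus utilization exceeds a threshold."""

    def __init__(self, threshold, window):
        self.threshold = threshold           # user-defined, as a fraction 0..1
        self.samples = deque(maxlen=window)  # most recent busy/idle samples

    def sample(self, bus_busy):
        """Record whether the trace bus was busy during one cycle."""
        self.samples.append(1 if bus_busy else 0)

    def blocked(self):
        """True only when utilization is strictly above the threshold."""
        if not self.samples:
            return False
        return sum(self.samples) / len(self.samples) > self.threshold
```

The strict comparison in `RealTimeThrottle.blocked()` follows the text: utilization less than or equal to the threshold leaves data trace operating normally. The two classes are independent, so they can be used separately or combined by blocking whenever either reports True.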

As shown in FIG. 2, statistics and various events of interest are input on line 201 to optional FIFO 203, and trace data is input on line 204 to FIFO 202. Both FIFOs 203 and 202 are connected to trace encoder and scheduler 208, with both the incoming data stream 204 and the internal state of FIFO 202 also connected to gap detect block 205, the output of which is connected to trace encoder and scheduler 208.

Input data 204 also connects to dead window throttle 206 and real time throttle 207, with trace bus 209 also connecting to real time throttle 207.

The outputs from dead window throttle 206 and real time throttle 207 connect to gap detect block 205 signaling a throttling requirement, and also to FIFO 202 to control data input to trace encoder and scheduler 208. The dead window and the real time throttles may be utilized independently or together.

1-9. (canceled)
 10. A debug trace gathering system comprising: a First In, First Out (FIFO) buffer memory having an input receiving trace data, an output of buffered data and a full output generating a FIFO full signal when said FIFO buffer memory is full; a counter connected to said FIFO buffer memory, said counter counting incoming data while said FIFO full signal indicates said FIFO buffer memory is full; and a trace encoder and scheduler connected to said FIFO buffer memory and said counter, said trace encoder and scheduler operable to output trace data corresponding to said buffered data while said FIFO full signal indicates said FIFO buffer memory is not full, not output trace data corresponding to said buffered data while said FIFO full signal indicates said FIFO buffer memory is full, and outputting a data gap marker upon said FIFO full signal indicating said FIFO buffer is not full following an indication that said FIFO buffer is full, said data gap marker including an accumulated count of said counter while said FIFO full signal indicated said FIFO buffer memory was full indicating an amount of data lost.
 11. The debug trace gathering system of claim 10, further comprising: at least one further trace data stream; for each of said at least one further trace data stream a further FIFO buffer memory, a further counter and a further trace encoder and scheduler.
 12. A debug trace gathering system comprising: a First In, First Out (FIFO) buffer memory having an input receiving trace data, an output of buffered data and a full output generating a FIFO fullness signal indicating a current FIFO fullness state; a dead window throttle connected to said FIFO buffer memory, said dead window throttle operable to open a dead window for a predetermined period of time when said FIFO fullness signal indicates a preprogrammed threshold of fullness; a trace encoder and scheduler connected to said FIFO buffer memory and said dead window throttle, said trace encoder and scheduler operable to output trace data corresponding to said buffered data while a dead window is not open, not output trace data corresponding to said buffered data while said dead window is open, and outputting a data gap marker upon said dead window opening.
 13. The debug trace gathering system of claim 12, wherein: said predetermined period of time said dead window is open is user programmable.
 14. The debug trace gathering system of claim 12, further comprising: a gap detect unit connected to said FIFO buffer memory, said dead window throttle and said trace encoder and scheduler, said gap detect unit operable to determine an amount of trace data received while a dead window is open; and wherein said gap marker of said trace encoder and scheduler indicates said amount of trace data received while a dead window is open.
 15. A debug trace gathering system comprising: a First In, First Out (FIFO) buffer memory having an input receiving trace data, an output of buffered data and a full output generating a FIFO fullness signal indicating a current FIFO fullness state; a real-time throttle unit having an input receiving said trace data, said real-time throttle unit generating an active throttle signal when trace bus utilization exceeds a programmable threshold; a trace encoder and scheduler connected to said FIFO buffer memory and said dead window throttle, said trace encoder and scheduler operable to output trace data corresponding to said buffered data while said throttle signal is not active, not output trace data corresponding to said buffered data while said throttle signal is active, and outputting a data gap marker upon said throttle signal becoming inactive.
 16. The debug trace gathering system of claim 15, further comprising: a gap detect unit connected to said FIFO buffer memory, said real-time throttle unit and said trace encoder and scheduler, said gap detect unit operable to determine an amount of trace data received while a dead window is open; and wherein said gap marker of said trace encoder and scheduler indicates said amount of trace data received while said throttle signal is active.
 17. A debug trace gathering system comprising: a First In, First Out (FIFO) buffer memory having an input receiving trace data, an output of buffered data and a full output generating a FIFO fullness signal indicating a current FIFO fullness state; a real-time throttle unit having an input receiving said trace data, said real-time throttle unit generating an active throttle signal when trace bus utilization exceeds a programmable threshold; a dead window unit opening a dead window upon an active throttle signal; a trace encoder and scheduler connected to said FIFO buffer memory and said dead window throttle, said trace encoder and scheduler operable to output trace data corresponding to said buffered data while a dead window is not open, not output trace data corresponding to said buffered data while said dead window is open, and outputting a data gap marker upon said dead window opening.
 18. The debug trace gathering system of claim 17, further comprising: a gap detect unit connected to said FIFO buffer memory, said real-time throttle unit and said trace encoder and scheduler, said gap detect unit operable to determine an amount of trace data received while a dead window is open; and wherein said gap marker of said trace encoder and scheduler indicates said amount of trace data received while said throttle signal is active.